Abstract

Modeling and forecasting covariance matrices of asset returns play a crucial role in finance.
The availability of high frequency intraday data enables the modeling of the realized
covariance matrix directly. However, most models in the literature suffer from the curse of
dimensionality. To solve the problem, we propose a factor model with a diagonal CAW model for
the factor realized covariance matrices. Asymptotic theory is derived for the estimated
parameters. In an extensive empirical analysis, we find that the number of parameters can be
reduced significantly. Furthermore, the proposed model maintains a comparable performance with
a benchmark vector autoregressive model.
1. Introduction


Modeling and forecasting covariance or volatility matrices of asset returns play a crucial role
in many financial fields, such as portfolio allocation (Markowitz, 1952) and asset pricing
(Bollerslev et al., 1988). With the availability of intraday financial data nowadays, it becomes
possible to estimate volatilities and co-volatilities of asset returns using high-frequency data
directly, which leads to the so-called realized covariance matrix (Andersen et al., 2003;
Barndorff-Nielsen and Shephard, 2004; Barndorff-Nielsen et al., 2011). Two major problems arise
in the estimation of realized covariance matrices. Firstly, transactions for different assets are
typically asynchronous so that the high-frequency prices of different assets do not change
simultaneously. Secondly, it is widely believed that the observed high-frequency prices are
accompanied by microstructure noise so that the observed prices should be thought of as a noisy
version of the true underlying price process. Researchers have proposed several ways to tackle
these problems, for example, the overlap intervals and previous tick method by Hayashi and
Yoshida (2005) and Zhang (2011), respectively. Moreover, Bannouh et al. (2012) use a refresh
time scheme and Christensen et al. (2010) propose the pre-averaging approach.

Once constructed, realized covariance matrices are analyzed using multivariate models.
There are two major issues to be resolved in modeling realized covariance matrices. The first
issue is that the model should guarantee the positive definiteness of fitted covariance matrices.
A natural choice in this respect is the family of matrix-valued Wishart distributions, which
automatically generates random positive definite matrices without imposing additional
constraints. Several models related to the Wishart distribution have been put forward.
Gourieroux et al. (2009) propose the Wishart Autoregressive (WAR) model where the realized
covariance matrix has a conditional distribution which is noncentral Wishart with a
non-centrality parameter depending on lagged covariances and a fixed scaling matrix. Later,
Golosnoy et al. (2012) propose the Conditional Autoregressive Wishart (CAW) model under which
the conditional distribution is central Wishart with time-dependent scaling matrices. Moreover,
Ng et al. (2014) generalize the above two models to construct the Generalized Conditional
Autoregressive Wishart (GCAW) model. There are other models involving the Wishart distribution;
for instance, see Jin and Maheu (2009).
The second issue in modeling realized covariance matrices is the high dimensionality. Indeed,
covariance matrices have $d(d+1)/2$ entries for $d$ assets; consequently, the number of
parameters in the model for realized covariance matrices grows quickly with $d$. For example,
for an unrestricted CAW(2,2) model with $d = 10$ assets, as many as 456 parameters are needed
so that it is quite challenging to fit such a model in practice. This is probably the major reason
why the empirical studies we find in the literature on model fitting for realized covariance
matrices are all limited to a small number of assets, say 3 to 5. Another problem inherent in
realized covariance matrices is that they deviate from their population counterpart, the
so-called integrated covariance matrix, when the number of assets is large compared to the
sample size (Johnstone and Lu, 2009; Wang and Zou, 2010). It is therefore important to build
statistical models for realized covariance matrices with a large dimension, say several tens.

Improved estimators of realized covariance matrices are proposed in Wang and Zou (2010)
with a so-called averaging realized volatility matrix (ARVM) estimator. Tao et al. (2011)
propose the threshold averaging realized volatility matrix (TARVM) estimator, a two-scale
estimator which uses the previous-tick method and the threshold technique in constructing
realized volatility matrices. Then, inspired by Zhang (2006) and Fan and Wang (2007), Tao et al.
(2013) propose the threshold multi-scale realized volatility matrix (TMSRVM) estimator. The
TARVM and TMSRVM estimators are shown to be consistent for the integrated covariance matrix
when the dimension of the realized covariance matrix, the sample size of intraday points
and the number of sampling days go to infinity. In addition, the TMSRVM estimator is shown to
have the optimal convergence rate in the presence of microstructure noise.

In an effort to control the parametric dimension of the models, Tao et al. (2011) propose to
first identify a small number of factors for the realized covariance matrices and to fit a vector
autoregressive (VAR) model to the vectorized factor covariance matrices. They show that such
a factor model significantly reduces the number of parameters needed to fit the realized
covariance matrices. However, the VAR specification is not able to ensure the positive
definiteness of the predicted factor covariance matrices. In addition (see below), such a VAR
fit still needs $O(r^4)$ parameters with $r$ factors.

In this article, we adopt the factor model approach introduced in Tao et al. (2011) for
realized covariance matrices. In order to overcome the aforementioned weakness of their VAR
fit for the extracted factors, we propose a diagonal CAW model which has several advantages.
Firstly, the proposed CAW model is able to guarantee automatically the positive definiteness
of the covariance matrices generated from the model without imposing additional constraints.
Secondly, as will be shown by the extensive data analysis reported in this paper, our model has
excellent empirical performance in terms of the reduction of the number of parameters compared
to the VAR approach proposed in Tao et al. (2011): indeed we obtain comparable forecasting
performance with far fewer parameters.

In a related work, Asai and McAleer (2014) also use a combination of factor extraction
and CAW modeling and report some empirical studies with 7 assets, which is still considered
a small dimension. In this paper, we focus on a larger collection of assets: the empirical
studies are carried out for 30 assets. A further difference is that we also provide a
thorough theoretical analysis of both the factor modeling and the CAW estimation.

The rest of the paper is organized as follows. Section 2 introduces the model setup and
our approach based on a factor model and a diagonal CAW model for the extracted factors. In
Section 3, the asymptotic theory is established. In Sections 4 and 5, we report the middle and
large scale data analysis on asset prices, respectively. Conclusions are presented in Section 6.
Proofs of the asymptotic theory are provided in the Appendix.


2. Methodology 

2.1. Model setup 

Suppose there are $d$ assets and their log price process $X(t) = \{X_1(t), \ldots, X_d(t)\}'$
follows a continuous diffusion model:

\[ dX(t) = \mu_t\,dt + \sigma_t\,dW_t, \quad t \in [0, T], \tag{1} \]

where $\mu_t$ is a drift in $\mathbb{R}^d$, $W_t$ is a standard $d$-dimensional Brownian motion
and $\sigma_t$ is a $d \times d$ matrix. The integrated volatility matrix for the $t$-th day is
defined as

\[ \Sigma_x(t) = \int_{t-1}^{t} \sigma_s \sigma_s'\,ds, \quad t = 1, \ldots, T. \tag{2} \]

However, it is commonly admitted that microstructure noise is inherent in the high-frequency
price process so that we are not able to observe $X_i(t)$ directly, but $Y_i(t_{i\ell})$, a noisy
version of $X_i(\cdot)$ at times $t_{i\ell}$, $\ell = 1, \ldots, n_i$, $i = 1, \ldots, d$. Here,
$n_i$ is the total number of trades and $t_{i\ell}$ is the $\ell$-th trading time of asset $i$
during a given trading day $t$. The observations are allowed to be non-synchronized, i.e.
$t_{i\ell} \neq t_{j\ell}$ for any $i \neq j$. In this paper, we assume that

\[ Y_i(t_{i\ell}) = X_i(t_{i\ell}) + \epsilon_i(t_{i\ell}), \tag{3} \]

where $\epsilon_i(t_{i\ell})$ are i.i.d. microstructure noise with mean zero and variance
$\eta_i$, and $\epsilon_i(\cdot)$ and $X_i(\cdot)$ are independent of each other.

2.2. Realized covariance matrix estimator 

Several issues arise for the estimation of $\Sigma_x(t)$: 1. asynchronous observations of
different assets; 2. microstructure noise; 3. the number of assets can be larger than the sample
size. In this paper, we adopt the threshold Multi-Scale Realized Volatility Matrix estimator
(threshold MSRVM) proposed by Tao et al. (2013), denoted by $\hat{\Sigma}_x(t)$,
$t = 1, \ldots, T$. The threshold MSRVM estimator has many attractive properties, for instance,
it is consistent for the high-dimensional integrated co-volatility matrix with the optimal
convergence rate. Briefly, the idea of the threshold MSRVM estimator is the following: the
previous-tick method is used to construct the raw realized covariance matrices. Then, a
multi-scale estimator is evaluated, which is actually a kind of average of those raw estimators.
In addition, the multi-scale estimator is regularized using a thresholding method, that is,
matrix entries under a threshold are set to zero.
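
The two elementary ingredients named above, previous-tick sampling and entrywise thresholding, can be sketched as follows. This is only an illustration under simplifying assumptions, not the threshold MSRVM estimator of Tao et al. (2013): the multi-scale averaging step is omitted, and the function names `previous_tick` and `hard_threshold` are hypothetical.

```python
import numpy as np

def previous_tick(times, prices, grid):
    """Price in force at each grid time: the last trade at or before it."""
    times = np.asarray(times, dtype=float)
    prices = np.asarray(prices, dtype=float)
    # Index of the last observation whose time stamp is <= the grid point.
    idx = np.searchsorted(times, grid, side="right") - 1
    if np.any(idx < 0):
        raise ValueError("grid starts before the first observation")
    return prices[idx]

def hard_threshold(matrix, tau):
    """Set entries whose absolute value is below tau to zero."""
    out = np.array(matrix, dtype=float, copy=True)
    out[np.abs(out) < tau] = 0.0
    return out
```

Sampling every asset on a common grid with `previous_tick` removes the asynchronicity, after which a raw realized covariance matrix can be formed from the synchronized returns.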
2.3. Matrix factor model

We adopt the factor model proposed by Tao et al. (2011) to reduce the large dimension of
$\Sigma_x(t)$:

\[ \Sigma_x(t) = A\,\Sigma_f(t)\,A' + \Sigma_0, \tag{4} \]

for $t = 1, \ldots, T$, where $\Sigma_f(t)$ are $r \times r$ positive definite factor covariance
matrices, $\Sigma_0$ is a $d \times d$ positive definite constant matrix and $A$ is a
$d \times r$ factor loading matrix normalized by the constraint $A'A = I_r$. As in a standard
factor model, only the left-hand side of Equation (4) is observed. The unknown quantities are
estimated using the method in Tao et al. (2011). Let

\[ \bar{\Sigma}_x = \frac{1}{T} \sum_{t=1}^{T} \Sigma_x(t), \qquad
   \bar{S}_x = \frac{1}{T} \sum_{t=1}^{T} \{\Sigma_x(t) - \bar{\Sigma}_x\}^2, \tag{5} \]

and

\[ \hat{\bar{\Sigma}}_x = \frac{1}{T} \sum_{t=1}^{T} \hat{\Sigma}_x(t), \qquad
   \hat{\bar{S}}_x = \frac{1}{T} \sum_{t=1}^{T} \{\hat{\Sigma}_x(t) - \hat{\bar{\Sigma}}_x\}^2. \tag{6} \]

Next, the estimator $\hat{A}$ is obtained using the $r$ orthonormal eigenvectors of
$\hat{\bar{S}}_x$, corresponding to its $r$ largest eigenvalues, as its columns. Finally, the
estimated factor covariance matrices are

\[ \hat{\Sigma}_f(t) = \hat{A}'\,\hat{\Sigma}_x(t)\,\hat{A} \tag{7} \]

for $t = 1, \ldots, T$; and $\Sigma_0$ is estimated by

\[ \hat{\Sigma}_0 = \hat{\bar{\Sigma}}_x - \hat{A}\hat{A}'\,\hat{\bar{\Sigma}}_x\,\hat{A}\hat{A}'. \tag{8} \]
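
The estimation steps in (6) to (8) can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the daily estimates are stacked in an array of shape (T, d, d); the function name `estimate_factor_model` is hypothetical and not taken from Tao et al. (2011).

```python
import numpy as np

def estimate_factor_model(sigma_x_hat, r):
    """sigma_x_hat: array of shape (T, d, d) holding the daily covariance estimates."""
    sigma_bar = sigma_x_hat.mean(axis=0)              # sample mean in Eq. (6)
    dev = sigma_x_hat - sigma_bar
    s_bar = np.mean(dev @ dev, axis=0)                # mean of squared deviations, Eq. (6)
    eigval, eigvec = np.linalg.eigh(s_bar)            # eigenvalues in ascending order
    a_hat = eigvec[:, ::-1][:, :r]                    # r leading eigenvectors as columns
    sigma_f_hat = a_hat.T @ sigma_x_hat @ a_hat       # Eq. (7), shape (T, r, r)
    proj = a_hat @ a_hat.T
    sigma_0_hat = sigma_bar - proj @ sigma_bar @ proj # Eq. (8)
    return a_hat, sigma_f_hat, sigma_0_hat, eigval[::-1]
```

The eigenvalues are returned in descending order so that a scree plot can be used to choose $r$, as is done in the data analyses below.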

2.4. CAW modeling for factor covariance matrix 

With the estimated factor covariance matrices $\{\hat{\Sigma}_f(t)\}$ calculated by (7) and (8),
we construct a dynamic structure by fitting a diagonal CAW model to $\tilde{\Sigma}_f(t)$, where
$\tilde{\Sigma}_f(t) := \Sigma_f(t) + A'\Sigma_0 A$. Here, $\tilde{\Sigma}_f(t)$ is modeled rather
than $\Sigma_f(t)$ since $\hat{\Sigma}_f(t)$ is in fact a consistent estimator of the former, and
it is impossible to construct a consistent estimator for the latter, to our knowledge.
The model is defined as follows. Let $\mathcal{F}_{t-1} = \sigma(\tilde{\Sigma}_f(s), s < t)$ be
the past history of the process at time $t$. Conditional on $\mathcal{F}_{t-1}$,
$\tilde{\Sigma}_f(t)$ follows a central Wishart distribution

\[ \tilde{\Sigma}_f(t) \mid \mathcal{F}_{t-1} \sim W_r\big(\nu,\, S_f(t)/\nu\big), \tag{9} \]

with $\nu$ the degrees of freedom and the scaling matrix $S_f(t)$. Moreover, the scaling matrix
$S_f(t)$ follows a linear recursion of order $(p, q)$

\[ S_f(t) = CC' + \sum_{i=1}^{p} B_i\,S_f(t-i)\,B_i' + \sum_{j=1}^{q} A_j\,\tilde{\Sigma}_f(t-j)\,A_j', \tag{10} \]

where $A_j$, $B_i$ and $C$ are all $r \times r$ matrices of coefficients.

In summary, the CAW process depends on the parameters
$\{\nu, C, (B_i)_{1 \le i \le p}, (A_j)_{1 \le j \le q}\}$ without additional constraints, so that
the total number of parameters is equal to $(p+q)r^2 + r(r+1)/2 + 1 = O(r^2)$, which still grows
quickly with the number of factors $r$ and the orders $p$ and $q$. Since the main aim of the
paper is to propose a practically feasible model for a large number of assets while retaining
efficiency, we will restrict ourselves to diagonal coefficient matrices $C$,
$(B_i)_{1 \le i \le p}$ and $(A_j)_{1 \le j \le q}$. Therefore, the number of parameters becomes
$(p+q+1)r + 1 = O(r)$. Notice that this setup is also supported by the fact that in the
literature (McCurdy and Stengos, 1992; Engle and Kroner, 1995), researchers tend to use diagonal
volatility models to avoid overparameterization and argue that the variances and the covariances
rely more on their own past than on the history of other variances or covariances. In the
empirical study developed below, we find that the diagonal models achieve a performance
comparable to that of unrestricted ones, while being much more parsimonious and requiring far
less computing time. Notice however that the asymptotic theory developed below is also valid for
unrestricted matrices $A_j$'s, $B_i$'s and $C$.
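
The parameter counts above are easy to tabulate. The sketch below uses hypothetical helper names; the VAR count (one intercept vector plus one square coefficient matrix per lag on the $r(r+1)/2$ vectorized entries) is an assumption that reproduces the counts reported for the VAR(1) benchmark in the data analyses.

```python
def caw_param_count(r, p, q, diagonal=True):
    """Number of free parameters of a CAW(p, q) model with r factors."""
    if diagonal:
        return (p + q + 1) * r + 1                    # diagonal C, B_i, A_j plus nu
    return (p + q) * r**2 + r * (r + 1) // 2 + 1      # full B_i, A_j, triangular C, nu

def var_param_count(r, lags=1):
    """Intercept plus coefficient matrices of a VAR on the vech of an r x r matrix."""
    m = r * (r + 1) // 2
    return m + lags * m * m
```

For instance, `caw_param_count(10, 2, 2, diagonal=False)` gives the 456 parameters quoted in the introduction, while `caw_param_count(3, 1, 1)` and `var_param_count(3)` give 10 and 42.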

The estimation of the parameters
$\theta = (\nu, \mathrm{diag}(C)', \mathrm{diag}(B_i)'_{1 \le i \le p}, \mathrm{diag}(A_j)'_{1 \le j \le q})'$
of the diagonal CAW($p$, $q$) model is carried out by maximizing the log-likelihood function
using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) optimization procedure. Positivity of the
diagonal elements $C_{kk}$, $A_{kk,j}$ and $B_{kk,i}$ is enforced, where $1 \le k \le r$. The
log-likelihood function is

\[ L(\theta) = \sum_{t=1}^{T} \Big\{ -\frac{\nu r}{2}\ln 2 - \frac{r(r-1)}{4}\ln\pi
   - \sum_{i=1}^{r} \ln\Gamma\Big(\frac{\nu+1-i}{2}\Big) - \frac{\nu}{2}\ln\Big|\frac{S_f(t)}{\nu}\Big|
   + \Big(\frac{\nu-r-1}{2}\Big)\ln|\tilde{\Sigma}_f(t)|
   - \frac{1}{2}\mathrm{tr}\big(\nu S_f(t)^{-1}\tilde{\Sigma}_f(t)\big) \Big\}. \tag{11} \]

In practice, initial values for $S_f(t)$ are needed to run the maximization of the log-likelihood
function. For example, if the order $(p, q) = (2, 2)$ is used, then the initial values $S_f(1)$
and $S_f(2)$ are needed. In the empirical analysis of this paper, we take
$S_f(1) = \hat{\Sigma}_f(1)$ and $S_f(2) = \hat{\Sigma}_f(2)$, as $S_f(t)$ is the conditional
expectation of $\tilde{\Sigma}_f(t)$, for any $t$.
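
The scale recursion and the Wishart log-likelihood can be evaluated as in the following sketch. It is restricted to the diagonal CAW(1,1) case for brevity, initializes the first scale matrix with the first observation as described above, and uses hypothetical function names; the returned value can be passed (with a minus sign) to any generic BFGS routine.

```python
import math
import numpy as np

def caw_scale_path(sigma, c, a, b):
    """Scale matrices of a diagonal CAW(1,1) model; sigma has shape (T, r, r)."""
    C, A, B = np.diag(c), np.diag(a), np.diag(b)
    scales = [sigma[0]]                     # initial value: first observed matrix
    for t in range(1, len(sigma)):
        scales.append(C @ C.T + B @ scales[-1] @ B.T + A @ sigma[t - 1] @ A.T)
    return np.array(scales)

def wishart_loglik(sigma, scales, nu):
    """Sum over t of the central Wishart log-density with mean scales[t]."""
    r = sigma.shape[1]
    const = (-0.5 * nu * r * math.log(2.0)
             - 0.25 * r * (r - 1) * math.log(math.pi)
             - sum(math.lgamma(0.5 * (nu + 1 - i)) for i in range(1, r + 1)))
    total = 0.0
    for sig_t, s_t in zip(sigma, scales):
        _, logdet_s = np.linalg.slogdet(s_t / nu)
        _, logdet_sig = np.linalg.slogdet(sig_t)
        total += (const - 0.5 * nu * logdet_s
                  + 0.5 * (nu - r - 1) * logdet_sig
                  - 0.5 * nu * np.trace(np.linalg.solve(s_t, sig_t)))
    return total
```

Because every term of the recursion is positive semidefinite and $CC'$ is positive definite when the diagonal of $C$ is nonzero, each scale matrix produced by `caw_scale_path` is positive definite.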

3. Asymptotic theory 

Given a $d$-dimensional vector $x = (x_1, \ldots, x_d)'$ and a $d \times d$ matrix
$U = (U_{ij})$, define vector and matrix norms as

\[ \|x\|_2 = \Big(\sum_{i=1}^{d} |x_i|^2\Big)^{1/2}, \qquad
   \|U\|_2 = \sup\{\|Ux\|_2 : \|x\|_2 = 1\}. \tag{12} \]

In fact, $\|U\|_2$ is the spectral norm, equal to the square root of the largest eigenvalue of
$U'U$. In addition, define the Frobenius norm of the $d \times d$ matrix $U = (U_{ij})$ as

\[ \|U\|_F = \Big(\sum_{i=1}^{d} \sum_{j=1}^{d} U_{ij}^2\Big)^{1/2}. \tag{13} \]

The asymptotic theory below uses the following assumptions.

(A1) All row vectors of $A'$ and $\Sigma_0$ in the factor model (4) satisfy the sparsity
condition (14) below. We say that a $d$-dimensional vector $x = (x_1, \ldots, x_d)'$ is sparse if

\[ \sum_{i=1}^{d} |x_i|^{\delta} \le C\,\pi(d), \tag{14} \]

where $0 \le \delta < 1$, $\pi(d)$ is a deterministic function of $d$ that grows slowly in $d$,
such as $\pi(d) = 1$ or $\ln(d)$, and $C$ is a positive constant.

(A2) The factor model (4) has $r$ fixed factors, with $A'A = I_r$, and matrices $\Sigma_0$ and
$\Sigma_f$ satisfy

\[ \|\Sigma_0\|_2 < \infty, \qquad
   \max_{1 \le t \le T} |\Sigma_{f,jj}(t)| = O_p(B(T)), \quad j = 1, \ldots, r, \tag{15} \]

where $1 \le B(T) = o(T)$.

(A3) $\max_{1 \le t \le T} \|\hat{\Sigma}_x(t) - \Sigma_x(t)\|_2 = O_p(A(d,T,n))$, for some rate
function $A(d,T,n)$ such that the product of $A(d,T,n)$ and a power of $B(T)$ is $o(1)$, with
$B(T)$ defined in (A2).

(A4) $A_j$'s and $B_i$'s are such that the CAW model (9) and (10) is stationary and ergodic.

(A5) The parameter set $\Theta$ for all parameters $A_j$'s, $B_i$'s, $C$ and $\nu$ is compact
for the CAW model (9) and (10).

(A6) The Hessian matrix $\partial^2 L(\theta)/\partial\theta_i\,\partial\theta_j$ converges, as
$T$ goes to infinity, to some deterministic matrix function of $\theta$ which is of full rank
for all $\theta \in \Theta$.

The first two conditions are from Tao et al. (2011), where they are used to prove the
consistency of $\hat{\Sigma}_f(t)$. In this paper, as we are using the threshold MSRVM estimator,
we take $A(d,T,n) = \pi(d)\,[e_n (d^2 T)^{1/\beta}]^{1-\delta} \ln T$ and $B(T) = \ln T$, where the
rate $e_n$ is consistent with Tao et al. (2013). In addition, the constant $\beta$ can be taken
large enough (under reasonable moment conditions) so that the rate condition in (A3) holds as
$n$, $d$, and $T$ go to infinity. Consequently, we assume here that these two values converge to
0 in some sense.


Theorem 1. Suppose the models (1), (3) and (4) satisfy Conditions (A1)-(A3). Denote the ordered
eigenvalues of $\bar{S}_x$ by $\lambda_1 \ge \cdots \ge \lambda_d$. Let $a_1, \ldots, a_r$ be the
eigenvectors of $\bar{S}_x$ corresponding to the $r$ largest eigenvalues
$\lambda_1, \ldots, \lambda_r$. Also let $\hat{\lambda}_1 \ge \cdots \ge \hat{\lambda}_r$ be the
$r$ largest eigenvalues of $\hat{\bar{S}}_x$ and $\hat{a}_1, \ldots, \hat{a}_r$ the corresponding
eigenvectors. Let $A = (a_1, \ldots, a_r)$ and $\hat{A} = (\hat{a}_1, \ldots, \hat{a}_r)$. As
$n$, $d$, $T$ go to infinity, we have

\[ \hat{A}'A - I_r = O_p\big(A(d,T,n)\,B(T)\big), \]

\[ \hat{\Sigma}_f(t) - \Sigma_f(t) - A'\Sigma_0 A = O_p\big(A^{1/2}(d,T,n)\,B^{1/2}(T)\big). \tag{16} \]


Theorem 2. Suppose that $\hat{\theta}$ is the maximum likelihood estimator of $\theta$ based on
the data $\hat{\Sigma}_f(t)$ from the CAW model and $\tilde{\theta}$ is the maximum likelihood
estimator based on the true data $\tilde{\Sigma}_f(t)$ from the same CAW model. Then under
Conditions (A1)-(A5),

\[ \hat{\theta} - \tilde{\theta} = O_p\big(A^{1/2}(d,T,n)\,B^{1/2}(T)\big). \tag{17} \]


4. Data analysis 1 

We apply the proposed methodology to two datasets. In this section, we focus on comparing
the VAR and CAW models, using 5-minute intraday data of 30 stocks traded in the US stock
market over a period of 103 days, where the data are obtained directly from Bloomberg.


4.1. Data description 

We use 30 stocks traded at the New York Stock Exchange, which consist of 27 components of the
Dow Jones Index: 3M (MMM), American Express (AXP), AT&T (T), Boeing (BA), Caterpillar (CAT),
Chevron (CVX), Coca-Cola (KO), Dupont (DD), ExxonMobil (XOM), General Electric (GE), Goldman
Sachs (GS), The Home Depot (HD), IBM (IBM), Johnson & Johnson (JNJ), JPMorgan Chase (JPM),
McDonald's (MCD), Merck (MRK), Nike (NKE), Pfizer (PFE), Procter & Gamble (PG), Travelers (TRV),
UnitedHealth Group (UNH), United Technologies (UTX), Verizon (VZ), Visa (V), Wal-Mart (WMT),
Walt Disney (DIS), and three former components of the Dow Jones Index: Honeywell (HON),
Citigroup (C) and American International Group (AIG). The daily realized covariance matrix is
computed as $\hat{\Sigma}_x(t) = \sum_{j} y_{t,j}\,y_{t,j}'$, where $y_{t,j}$ is the vector of
log-returns for the 30 stocks computed for the $j$-th 5-minute interval of trading day $t$,
excluding the first and last half hour. The sample period starts on May 3, 2013 and ends on
September 30, 2013, for a total of 103 trading days. (Here, we exclude July 4 due to the
incompleteness of data.) This generates a series of 103 matrices $\hat{\Sigma}_x(t)$, each of
size 30 by 30.
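
The sum of outer products of intraday return vectors reduces to a single matrix product. A minimal sketch, assuming the log prices of one day have already been sampled on a regular 5-minute grid with the first and last half hour removed (the function name is hypothetical):

```python
import numpy as np

def daily_realized_covariance(log_prices):
    """log_prices: (m + 1) x d array of log prices on a 5-minute grid for one day."""
    y = np.diff(np.asarray(log_prices, dtype=float), axis=0)  # m x d matrix of returns
    return y.T @ y                                            # sum over j of y_j y_j'
```

The result is symmetric and positive semidefinite by construction; it is positive definite only when the number of intraday returns is at least the number of assets.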

Descriptive statistics of the realized variances and covariances are provided in Table 1. We
only show some entries due to limited space. The following properties are found:

1. Among 30 realized variances and 435 covariances, only 3 realized covariances are skewed
to the left rather than to the right.

2. All realized variances and covariances have larger kurtosis than that of the normal
distribution except for 2 realized covariances.

For the next two subsections, we show the estimated results for both the diagonal CAW and
VAR models when the first k = 98 days are treated as data points.


4.2. Model fitting 

The eigenvalues of the sample variance matrix $\hat{\bar{S}}_x$ are evaluated, and are shown in
Figure 1. The plots show that the three largest eigenvalues are much larger than the others,
which indicates that the factor number $r = 3$ is appropriate.

Let $\hat{A}$ be the matrix of eigenvectors of $\hat{\bar{S}}_x$ corresponding to the three
largest eigenvalues. We then calculate the factor volatility matrices $\hat{\Sigma}_f(t)$, which
are 3 by 3. Figure 2 shows time series plots of the variances and covariances of the factor
covariance matrices.

Then, we fit the diagonal CAW model to the matrix series. Different orders are used for
comparison, namely $(p,q) = (0,1)$, $(p,q) = (1,1)$, $(p,q) = (1,2)$, $(p,q) = (2,1)$ and
$(p,q) = (2,2)$.


Remark 1. The initial value for each numerical optimization is chosen randomly and we repeat 
160 times to choose the one with the largest log-likelihood value. 

4.3. VAR model 

For comparison purposes, we also fit the VAR model, advocated in Tao et al. (2011), to the
vectorized factor covariance matrix $\hat{\Sigma}_f(t)$, which is a vector with 6 entries. Using
the package "vars" in R, we select the VAR(1) model as all the model selection criteria, namely
the Akaike information criterion (AIC), Hannan-Quinn (HQ), Schwarz criterion (SC) and final
prediction error (FPE), choose the order 1, as shown in Table 2.

The model is

\[ \mathrm{vech}\{\hat{\Sigma}_f(t)\} = A_0 + A_1\,\mathrm{vech}\{\hat{\Sigma}_f(t-1)\} + e(t), \]

where $A_0$ is a 6-dimensional vector, $A_1$ is a $6 \times 6$ square matrix, and $e(t)$ is a
6-dimensional vector white noise process with zero mean and finite fourth moments. Both $A_0$
and $A_1$ are estimated by the least squares method.
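
The half-vectorization and the least squares fit can be sketched with NumPy as follows. This is an illustration only, not the "vars" implementation used in the paper; the function names are hypothetical and no order selection is performed.

```python
import numpy as np

def vech(m):
    """Stack the lower-triangular part of a symmetric matrix column by column."""
    upper_rows, upper_cols = np.triu_indices(m.shape[0])
    return m[upper_cols, upper_rows]

def fit_var1(series):
    """Least squares fit of x_t = a0 + A1 x_{t-1} + e_t; series has shape (T, m)."""
    y = series[1:]
    x = np.hstack([np.ones((len(series) - 1, 1)), series[:-1]])
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    return coef[0], coef[1:].T              # intercept (m,), coefficient matrix (m, m)

def forecast_var1(a0, a1, last):
    """One-step-ahead forecast from the last observed vector."""
    return a0 + a1 @ last
```

Applying `vech` to each factor covariance matrix and stacking the results row by row gives the `series` argument of `fit_var1`.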

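The fitted VAR(1) model has the coefficients:

\[
A_0 = \begin{pmatrix} 54.49 \\ -1.482 \\ 4.937 \\ 7.632 \\ 0.164 \\ 10.27 \end{pmatrix},
\qquad
A_1 = 10^{-2} \begin{pmatrix}
55.8 & -31.2 & 19.6 & -37.4 & -199.1 & \ast \\
-4.1 & 20.6 & 3.2 & 7.7 & 45.4 & -2.0 \\
6.8 & 15.3 & 72.8 & -6.5 & -230.9 & -0.1 \\
0.5 & -26.5 & -84.6 & 4.4 & 283.8 & -6.0 \\
-0.6 & 2.5 & 2.2 & 1.0 & -3.7 & -0.3 \\
10.1 & 43.5 & -17.4 & 11.1 & -207.5 & 13.4
\end{pmatrix},
\]

where the entry marked $\ast$ is not legible in the source.
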


The fitted models are shown in Figure 4, which indicates that the VAR(1) model provides an
adequate fit to the data. For each of the 6 time series, we show the in-sample fit in the upper
panel, where the solid line stands for the real data and the dashed line is the fitted series.
The residuals are shown in the middle panel while the ACF and PACF of residuals are displayed in
the lower panel. The ACF and PACF plots show that the residuals are serially uncorrelated.


4.4. Performance comparison in out-of-sample forecasting 

We compare the out-of-sample one-day-ahead forecast performance of the diagonal CAW and
VAR models. The one-day-ahead realized covariance is calculated by: 1. Obtain the
one-day-ahead factor covariance matrix by conditional expectation; 2. Plug the forecast factor
covariance matrix into the factor model (4) to get the predicted realized covariance matrix.
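
Written out, the two steps amount to the following, where the notation $\hat{\Sigma}_f(k+1 \mid k)$ and $\hat{\Sigma}_x(k+1 \mid k)$ for forecasts based on the first $k$ days is introduced here for illustration only and the estimates are those of Section 2.3. This is one reading of the procedure rather than a formula quoted from the text.

```latex
% Step 1: conditional expectation of the factor covariance matrix.
% Under the CAW model this is the next scale matrix from recursion (10);
% under the VAR(1) model it is the fitted one-step-ahead vech forecast.
\hat{\Sigma}_f(k+1 \mid k) =
\begin{cases}
  \hat{S}_f(k+1), & \text{diagonal CAW}, \\
  \mathrm{vech}^{-1}\big\{\hat{A}_0 + \hat{A}_1\,\mathrm{vech}\{\hat{\Sigma}_f(k)\}\big\}, & \text{VAR(1)}.
\end{cases}

% Step 2: plug the forecast into the factor model (4).
\hat{\Sigma}_x(k+1 \mid k) = \hat{A}\,\hat{\Sigma}_f(k+1 \mid k)\,\hat{A}' + \hat{\Sigma}_0 .
```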

The predictive accuracy is measured with both the Frobenius norm and the spectral norm.
We take the first $k$ days as data and forecast the next day, where $k = 80, \ldots, 98$. Every
model is re-estimated and the new forecasts are generated based upon the new parameter estimates.
Then, we take the average of the errors over the 19 periods for the comparison. Moreover, the
errors of the inverses of the matrices are also compared.
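
The four error measures can be computed as in the sketch below; the function name and the dictionary keys are hypothetical, chosen to mirror the FN and SN labels used in the tables.

```python
import numpy as np

def forecast_errors(predicted, realized):
    """Frobenius (FN) and spectral (SN) norm errors for a forecast and its inverse."""
    diff = predicted - realized
    inv_diff = np.linalg.inv(predicted) - np.linalg.inv(realized)
    return {
        "FN": np.linalg.norm(diff, "fro"),
        "SN": np.linalg.norm(diff, 2),
        "FN_inverse": np.linalg.norm(inv_diff, "fro"),
        "SN_inverse": np.linalg.norm(inv_diff, 2),
    }
```

Averaging each entry over the rolling forecast origins gives one row of the comparison table for a given model.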

Table 3 contains the results of the prediction accuracy of the two models using different
norms. The main findings are as follows.

1. The CAW model with order $(p,q) = (1,1)$ performs the best as it has the smallest error
under both Frobenius and spectral norms. In general, all CAW models have similar
performance except the one with order $(p,q) = (0,1)$.

2. The CAW models have performance similar to that of the VAR(1) model, except the
one with order $(p,q) = (0,1)$.

3. However, the diagonal CAW model needs far fewer parameters than the VAR model does.
Here, the best CAW model with order $(p,q) = (1,1)$ only needs 10 parameters, about a
quarter of the number of parameters that the VAR(1) model needs.

4. As $d$ goes up, we can expect that $r$ will become larger; in this situation, the number
of parameters needed for the diagonal CAW model will be smaller still relative to the number
needed for the VAR model.

Remark 2. 

1. We also compute predictions for 2 to 5 days ahead in addition to the one-day forecast. We
find that the performance of predictions for longer horizons is, in general, similar to that
for the one-day horizon.

2. One problem for the VAR model is that it cannot assure that the predicted factor covari¬ 
ance matrix is positive definite. We have checked that in this data analysis, the predicted 
factor covariance matrices are actually all positive definite. This may be due to the 
reason that the estimated factor variances are larger than the covariances (in absolute 
value). 

5. Data analysis 2 

5.1. Data description 

We use the same 30 stocks as in the previous section. The raw tick-by-tick trading data are
downloaded from the TAQ database of Wharton Research Data Services. The data period starts on
January 3, 2012 and ends on December 31, 2012, for a total of 250 trading days.
We first conduct data cleaning with the procedures introduced in Brownlees and Gallo
(2006) and Barndorff-Nielsen et al. (2009). The steps are the following:


1. Delete entries with a time stamp outside 9:30 am - 4:00 pm when the exchange is open. 

2. Delete entries with a time stamp inside 9:30 - 10:00 am or 3:30 - 4:00 pm to eliminate 
the open and end effect of price fluctuation. 

3. Delete entries with the transaction price equal to zero. 

4. If multiple transactions have the same time stamp, use the median price. 

5. Delete entries with prices which are outliers. Let $\{p_i\}$ be an ordered tick-by-tick price
series. We treat the $i$-th price as an outlier if $|p_i - \bar{p}_i(k)| > 3 s_i(k)$, where
$\bar{p}_i(k)$ and $s_i(k)$ denote the sample mean and sample standard deviation of a
neighborhood of $k$ observations around $i$, respectively. For the beginning prices which may
not have enough left-hand side neighbors, we get $k - i$ neighbors from $i + 1$ to $k + 1$.
Similar procedures are taken for the ending prices.
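
Steps 4 and 5 can be sketched in plain Python. The neighborhood rule below is an assumption made for illustration: it takes $k$ neighbors centered on the price, excludes the price itself, and shifts the window inward at both ends of the series. The function names and the default $k = 20$ are hypothetical.

```python
from statistics import mean, median, stdev

def collapse_duplicates(ticks):
    """ticks: time-sorted (timestamp, price) pairs; same-stamp trades -> median price."""
    grouped = {}
    for stamp, price in ticks:
        grouped.setdefault(stamp, []).append(price)
    return [(stamp, median(prices)) for stamp, prices in grouped.items()]

def neighbourhood(n, i, k):
    """Indices of k neighbours of position i (excluding i), shifted inward near the ends."""
    half = k // 2
    start = max(0, min(i - half, n - k - 1))
    return [j for j in range(start, start + k + 1) if j != i]

def remove_outliers(prices, k=20, width=3.0):
    """Keep prices within `width` local standard deviations of the local mean."""
    n = len(prices)                      # requires n >= k + 1
    kept = []
    for i, p in enumerate(prices):
        window = [prices[j] for j in neighbourhood(n, i, k)]
        if abs(p - mean(window)) <= width * stdev(window):
            kept.append(p)
    return kept
```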

Then, we construct the threshold MSRVM estimator based on the cleaned tick-by-tick data 
following the steps in Tao et al. (2013). We set the threshold to be 5% of the largest of the 
absolute value of entries in the matrix. 

Descriptive statistics of selected realized variances and covariances are provided in Table 4. 
In addition, two plots for realized variances and two plots for realized covariances are shown 
in Figure 5. We find the following properties: 

1. All 30 realized variances and 435 covariances are skewed to the right, with mean skew¬ 
ness 1.39. 

2. All realized variances and covariances have bigger kurtosis than that of the normal dis¬ 
tribution, with mean kurtosis 5.92 showing fat tails. 

3. The realized variances and covariances have significant fluctuations during the year, in¬ 
dicated by the graphs. 
For the next two subsections, we show the estimated results for both the diagonal CAW and
VAR models when the first k = 220 days are treated as data points.


5.2. Model fitting 

The eigenvalues of the sample variance matrix $\hat{\bar{S}}_x$ are evaluated, and are shown in
Figure 5. We choose four factors for the model, as there is a discernible drop between the
fourth and the fifth eigenvalues, though the second to fourth eigenvalues are much smaller than
the largest one.

Let $\hat{A}$ be the matrix of eigenvectors of $\hat{\bar{S}}_x$ corresponding to the four
largest eigenvalues. We calculate the factor volatility matrices $\hat{\Sigma}_f(t)$, which are
4 by 4. Then, we fit the diagonal CAW model to the matrix series. Different orders are used for
comparison, namely $(p,q) = (0,1)$, $(p,q) = (1,1)$, $(p,q) = (1,2)$, $(p,q) = (2,1)$ and
$(p,q) = (2,2)$. For every estimation, we randomly choose 60 initial values for the optimization
of the log-likelihood and keep the one with the largest log-likelihood value to give the
estimated parameters.


5.3. VAR model 

For comparison purposes, we also fit the VAR model to the vectorized factor covariance matrix
$\hat{\Sigma}_f(t)$, which is a vector with 10 entries. Again, using the package "vars" in R, we
select the VAR(1) model as all the model selection criteria AIC, HQ, SC and FPE choose the
order 1, as shown in Table 5.

The model is

\[ \mathrm{vech}\{\hat{\Sigma}_f(t)\} = A_0 + A_1\,\mathrm{vech}\{\hat{\Sigma}_f(t-1)\} + e(t), \]

where $A_0$ is a 10-dimensional vector, $A_1$ is a $10 \times 10$ square matrix, and $e(t)$ is a
10-dimensional vector white noise process with zero mean and finite fourth moments. Both $A_0$
and $A_1$ are estimated by the least squares method.

We denote the estimator of $A_1$ by $\hat{A}_1$. We find that $|\hat{A}_1|$ = -1.03 x 10“ and
the largest absolute eigenvalue is 0.4906 < 1, which ensures the stationarity of the VAR model.

In addition, the sparsity of $\hat{A}_1$ is checked. We compute the values
$\sum_{i,j} |a_{ij}|^m$ for different $0 < m < 1$, where
$\hat{A}_1 = \{a_{ij}\}_{1 \le i,j \le 10}$. The results are reported in Table 6. We can see from
Table 6 that the average absolute value of the entries of $\hat{A}_1$ is small (less than 1) so
that we can consider $\hat{A}_1$ as almost sparse.
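
Both checks on the estimated coefficient matrix are short to compute. A sketch with hypothetical function names, where the stationarity check is the usual VAR(1) condition that all eigenvalues lie inside the unit circle:

```python
import numpy as np

def sparsity_measure(a, m):
    """Sum over all entries of |a_ij|^m, for an exponent 0 < m < 1."""
    return float(np.sum(np.abs(np.asarray(a, dtype=float)) ** m))

def is_stationary_var1(a1):
    """True when the largest absolute eigenvalue of the coefficient matrix is below 1."""
    return bool(np.max(np.abs(np.linalg.eigvals(a1))) < 1.0)
```

As $m$ decreases towards 0 the measure approaches the number of nonzero entries, so small values for small $m$ indicate a matrix that is close to sparse.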


5.4. Performance comparison in out-of-sample forecasting 

We compare the out-of-sample one-day-ahead forecast performance of the diagonal CAW and
VAR models. The prediction of the one-day-ahead realized covariance is calculated by:
1. Predict the one-day-ahead factor covariance matrix by conditional expectation; 2. Plug the
forecast factor covariance matrix into the factor model (4) to get the predicted realized
covariance matrix.

The predictive accuracy is measured with both the Frobenius norm and the spectral norm.
We take the first $k$ days as data and forecast the next day, where $k = 220, \ldots, 240$. Every
model is re-estimated and the new forecasts are generated based upon the new parameter estimates.
Then, we take the average of the errors over the 21 periods for the comparison. Moreover, the
errors of the inverses of the matrices are also compared.

Table 7 contains the results of the prediction error of the two models using different norms.
The main findings are as follows.

1. The diagonal CAW model with order $(p,q) = (1,1)$ performs the best among the CAW
models as it has the smallest error under both Frobenius and spectral norms. In general,
all CAW models have similar performance except the one with order $(p,q) = (0,1)$.
The result also indicates the possibility of over-parameterization with orders larger than
(1,1).

2. The diagonal CAW models have slightly worse, but comparable, performance relative to
the VAR(1) model, except the one with order $(p,q) = (0,1)$.

3. The diagonal CAW model needs far fewer parameters than the VAR model does. Here, the
best CAW model with order $(p,q) = (1,1)$ only needs 13 parameters, nearly a tenth of
the number of parameters that the VAR(1) model needs.

4. We also compute predictions for 2 to 5 days ahead in addition to the one-day forecast. We
find that the performance of predictions for longer horizons is, in general, similar to that
for the one-day horizon.

6. Conclusions 

In the literature, most models dealing with the realized covariance matrix focus on a small
number of assets and become infeasible when the dimension is large. In order to solve the
problem, we propose a factor model with a diagonal CAW model fitted to the factor covariance
matrix. Our model performs comparably with the VAR model while requiring far fewer parameters.
For example, in the second data analysis, the CAW model with order $(p,q) = (1,1)$
performs similarly to the VAR model, measured both in the Frobenius norm and the spectral
norm, but only needs nearly one tenth of the number of parameters of the latter. In addition, the
model ensures the positive definiteness of the predicted covariance matrices. Diagnostic tests
for the proposed model are worth considering in a future study.





Table 1: Descriptive Statistics for Selected Realized Variances and Covariances

We report the descriptive statistics of the realized variances and covariances of the first dataset,
namely mean, maximum, minimum, standard deviation, skewness and kurtosis. We only show
some entries due to limited space.

Stock       Mean       Maximum    Minimum    SD         Skewness   Kurtosis
            (*10^-5)   (*10^-4)   (*10^-5)   (*10^-5)

Realized Variance
MMM         3.25       1.27       0.63       1.96       1.85       4.84
AXP         6.90       6.05       2.03       7.02       4.74       31.0
T           4.85       1.36       1.29       2.60       1.15       0.85
BA          8.60       28.5       1.31       27.8       9.61       92.8
CAT         6.39       3.70       1.99       4.82       4.43       23.8

Realized Covariance
MMM-AXP     2.17       1.23       -0.25      2.03       2.13       6.27
MMM-T       1.40       0.77       -0.37      1.48       1.86       3.80
MMM-BA      1.94       0.95       -2.33      1.85       1.36       2.67
MMM-CAT     2.11       1.08       -0.03      1.72       2.23       7.01
AXP-T       1.96       0.96       -0.66      1.95       1.70       3.15
AXP-BA      2.18       1.17       -7.27      2.39       0.45       3.36
AXP-CAT     2.24       1.14       -0.69      2.11       1.80       4.37
T-BA        1.38       0.86       -1.47      1.59       1.95       5.33
T-CAT       1.41       0.84       -1.13      1.61       1.80       3.95
BA-CAT      2.12       0.98       -0.32      1.76       1.65       3.98

Table 2: The Selection of the Order of VAR Model


We fit the VAR model to the vectorized factor covariance matrix $\hat{\Sigma}_f(t)$, which is a
vector with 6 entries. Using the package "vars" in R, we select the VAR(1) model by the Akaike
information criterion (AIC), Hannan-Quinn (HQ), Schwarz criterion (SC) and final prediction
error (FPE).


Order              1         2         3         4         5

AIC(n) *10^2       -1.107    -1.102    -1.098    -1.094    -1.090
HQ(n) *10^2        -1.102    -1.093    -1.084    -1.077    -1.068
SC(n) *10^2        -1.094    -1.079    -1.065    -1.051    -1.036
FPE(n) *10^-48     0.830     1.313     2.181     3.268     5.772
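
The order selection summarized in Table 2 can be sketched outside R as follows. This is a minimal illustration, not the code of the "vars" package: it assumes the textbook definitions of the four criteria based on the residual covariance matrix of an OLS-fitted VAR with intercept, estimated on a common sample for all candidate orders. The function names and the simulated data are hypothetical.

```python
import numpy as np


def var_order_criteria(y, max_lag):
    """AIC, HQ, SC and FPE for VAR(1), ..., VAR(max_lag) fitted by OLS."""
    y = np.asarray(y, dtype=float)
    n_obs, k = y.shape
    t_eff = n_obs - max_lag              # common estimation sample for all orders
    target = y[max_lag:]                 # shape (t_eff, k)
    out = {}
    for p in range(1, max_lag + 1):
        # Block j holds y_{t-j} aligned with the rows of `target`.
        lags = [y[max_lag - j:n_obs - j] for j in range(1, p + 1)]
        x = np.hstack([np.ones((t_eff, 1))] + lags)
        coef, *_ = np.linalg.lstsq(x, target, rcond=None)
        resid = target - x @ coef
        sigma_u = resid.T @ resid / t_eff          # residual covariance estimate
        _, logdet = np.linalg.slogdet(sigma_u)
        n_star = p * k + 1                         # regressors per equation
        out[p] = {
            "AIC": logdet + 2.0 * p * k**2 / t_eff,
            "HQ": logdet + 2.0 * np.log(np.log(t_eff)) * p * k**2 / t_eff,
            "SC": logdet + np.log(t_eff) * p * k**2 / t_eff,
            "FPE": ((t_eff + n_star) / (t_eff - n_star)) ** k * np.exp(logdet),
        }
    return out


def select_orders(criteria):
    """Order minimizing each criterion."""
    names = ("AIC", "HQ", "SC", "FPE")
    return {name: min(criteria, key=lambda p: criteria[p][name]) for name in names}


# Hypothetical data: a simulated bivariate VAR(1), used only for illustration.
rng = np.random.default_rng(1)
y = np.zeros((300, 2))
for t in range(1, 300):
    y[t] = 0.5 * y[t - 1] + rng.standard_normal(2)

criteria = var_order_criteria(y, max_lag=3)
print(select_orders(criteria))
```

Under these definitions the logarithm of FPE is close to AIC, so the scale of the FPE row is tied to the scale of the AIC row.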
Table 3: Forecast errors for CAW and VAR models using different norms


We report the results of the prediction accuracy of the two models using different norms. In
addition, the prediction accuracy of the inverse of the matrices is shown as well. Here, FN
stands for the Frobenius norm and SN for the spectral norm.


Model   Order           Number of     FN         SN         FN for Inverse   SN for Inverse
                        Parameters    (*10^-4)   (*10^-4)   (*10^)           (*10^)

CAW     p = 0, q = 1    7             6.21       5.66       6.30             3.84
CAW     p = 1, q = 1    10            4.77       3.95       6.31             3.84
CAW     p = 1, q = 2    13            4.92       4.01       6.31             3.84
CAW     p = 2, q = 1    13            4.96       4.01       6.31             3.84
CAW     p = 2, q = 2    16            4.98       4.03       6.31             3.84

VAR     VAR(1)          42            4.81       3.99       6.31             3.84
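
The four error measures in Table 3 can be illustrated with a short sketch. It computes the Frobenius and spectral norms of the difference between a predicted and an actual covariance matrix, and of the difference between their inverses, for a single forecast date. How the errors are aggregated over the forecast period is not described in this section, so no aggregation is attempted here; the function name and the toy matrices are illustrative assumptions.

```python
import numpy as np


def forecast_errors(actual, predicted):
    """FN and SN of the forecast error and of the error in the inverses."""
    diff = predicted - actual
    inv_diff = np.linalg.inv(predicted) - np.linalg.inv(actual)
    return {
        "FN": float(np.linalg.norm(diff, "fro")),       # Frobenius norm
        "SN": float(np.linalg.norm(diff, 2)),           # spectral norm
        "FN_inv": float(np.linalg.norm(inv_diff, "fro")),
        "SN_inv": float(np.linalg.norm(inv_diff, 2)),
    }


# Toy 2 x 2 example, not taken from the data sets of the paper.
actual = np.diag([1.0, 1.0])
predicted = np.diag([4.0, 5.0])
errors = forecast_errors(actual, predicted)
print(errors)
```

The spectral norm never exceeds the Frobenius norm, which is consistent with the SN columns being smaller than the FN columns in the table.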
Table 4: Descriptive Statistics for Some Selected Realized Variances and Covariances


We report the descriptive statistics of the realized variances and covariances of the second
dataset, namely mean, maximum, minimum, standard deviation, skewness and kurtosis. We
only show some entries due to limited space.


Stock      Mean       Maximum    Minimum    SD         Skewness   Kurtosis
           (*10^-5)   (*10^-4)   (*10^-5)   (*10^-5)

Realized Variance
AIG        17.4       9.16       3.05       11.2       2.47       13.1
AXP        6.12       6.38       1.45       4.71       7.67       91.5
BA         5.46       2.19       1.11       3.25       1.70       7.33
C          16.7       9.62       2.80       10.9       2.56       15.3

Realized Covariance
AIG-AXP    4.30       1.65       -1.99      3.13       1.00       3.76
AIG-BA     3.32       1.32       -2.74      2.82       1.03       3.93
AIG-C      7.72       4.54       -1.47      5.99       1.85       9.37
AXP-BA     2.41       1.01       -0.59      1.95       1.26       4.37
AXP-C      5.03       2.24       -0.75      3.53       1.61       6.84
BA-C       3.69       1.75       -1.40      3.13       1.66       6.30
Table 5: The Selection of the Order of VAR Model


We fit the VAR model to the vectorized factor covariance matrix $\hat{\Sigma}_f(t)$, which is a
vector with 10 entries. Using the package "vars" in R, we select the VAR(1) model because all
the model selection criteria, namely the Akaike information criterion (AIC), Hannan-Quinn (HQ),
Schwarz criterion (SC) and final prediction error (FPE), choose order 1.


Order              1         2         3         4         5

AIC(n) *10^2       -1.949    -1.943    -1.937    -1.933    -1.932
HQ(n) *10^2        -1.939    -1.925    -1.910    -1.898    -1.888
SC(n) *10^2        -1.924    -1.897    -1.871    -1.846    -1.824
FPE(n) *10^-84     0.234     0.416     0.796     1.292     1.751
Table 6: Sparsity of $A_1$

We compute the values for different $0 < m < 1$, where $A_1 = \{a_{ij}\}_{1 \le i,j \le 10}$.


m                                             0.05     0.1      0.15     0.2      0.25     0.3

$\frac{1}{10^2}\sum_{i,j}|a_{ij}|^m$          0.8686   0.7586   0.666    0.5877   0.5212   0.4646





Table 7: Forecast errors for CAW and VAR models using different norms 


We report the results of the prediction accuracy of the two models using different norms. In 
addition, the prediction accuracy of the inverse of the matrices is shown as well. Here, FN is 
for the Frobenius norm and SN for the spectral norm. 


Model   Order           Number of     FN         SN         FN for Inverse   SN for Inverse
                        Parameters    (*10^-4)   (*10^-4)   (*10^)           (*10^)

CAW     p = 0, q = 1    9             6.10       5.58       7.30             4.31
CAW     p = 1, q = 1    13            5.27       4.80       7.31             4.31
CAW     p = 1, q = 2    17            5.28       4.81       7.31             4.31
CAW     p = 2, q = 1    17            5.28       4.83       7.31             4.31
CAW     p = 2, q = 2    21            5.28       4.82       7.31             4.31

VAR     VAR(1)          110           5.18       4.76       7.31             4.30






Figure 1: Plots of the eigenvalues of $\hat{S}_x$ for the dataset when k = 9S.


Figure 2: Time series of variances and covariances for factor covariance matrices. Panel
titles: "11", "21", and so on.

Figure 3: The plot of the fitted VAR(1) model, residuals of the fitted VAR(1) model, and ACF
and PACF of the residuals. Panel titles: "Diagram of fit and residuals for y1" to "Diagram of
fit and residuals for y6".

Figure 4: The plot of selected realized variances and covariances.

Figure 5: Plots of the eigenvalues of $\hat{S}_x$ for the dataset when k = 220.



Appendix A. Proof of theorems 


For convenience, we denote $A(d, T, n)$ and $B(T)$ by $A$ and $B$ in the proofs. The loading
matrix is written $\mathbf{A}$, with columns $a_j$.


Proof of Theorem 1. Following Theorem 1 in Tao et al. (2011), we can easily show that
$\|\hat{S}_x - S_x\|_2 = O_p(AB)$.


Then, we claim that

$$\max_{1 \le j \le r} |\hat{\lambda}_j - \lambda_j| = O_p(AB), \tag{A.1}$$

$$\max_{1 \le j \le r} \|\hat{a}_j - a_j\|_2 = O_p(AB). \tag{A.2}$$

Let $P = \hat{S}_x - S_x$ with ordered eigenvalues $\rho_1 \ge \cdots \ge \rho_d$. Then, we have
$\rho_d \le \hat{\lambda}_j - \lambda_j \le \rho_1$. As a result,

$$|\hat{\lambda}_j - \lambda_j| \le \max(|\rho_1|, |\rho_d|) = \|\hat{S}_x - S_x\|_2, \tag{A.3}$$

which proves equation (A.1). Equation (A.2) follows from Theorem 1 in Bickel and Levina
(2008) and the same argument as in the proof of Theorem 5 in the same paper.

For the $j$-th diagonal entry of $\hat{\mathbf{A}}'\mathbf{A} - I_r$,

$$\hat{a}_j' a_j - 1 = -\|\hat{a}_j - a_j\|_2^2 / 2 = O_p(A^2B^2) \le O_p(AB), \tag{A.4}$$

as we assume $AB$ goes to zero. For the off-diagonal entry $(k, j)$ with $k \ne j$,

$$|\hat{a}_k' a_j| = |\hat{a}_k'(a_j - \hat{a}_j)|
\le \|\hat{a}_k\|_2 \|a_j - \hat{a}_j\|_2 = \|\hat{a}_j - a_j\|_2
= O_p(AB), \tag{A.5}$$

which proves the first result.
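
Two elementary facts used in this part of the proof can be checked numerically: the eigenvalue bound behind (A.3), which is Weyl's inequality for symmetric matrices, and the identity for unit vectors used in (A.4). The sketch below is only an illustration on randomly generated matrices and vectors; the function names and the toy inputs are assumptions and do not come from the paper.

```python
import numpy as np


def eigen_gap_and_bound(s, s_hat):
    """Return (max_j |lam_hat_j - lam_j|, ||s_hat - s||_2) for symmetric inputs."""
    lam = np.linalg.eigvalsh(s)[::-1]          # eigenvalues, largest first
    lam_hat = np.linalg.eigvalsh(s_hat)[::-1]
    gap = float(np.max(np.abs(lam_hat - lam)))
    bound = float(np.linalg.norm(s_hat - s, 2))  # spectral norm of the perturbation
    return gap, bound


def unit_vector_identity_gap(a_hat, a):
    """Difference between a_hat'a - 1 and -||a_hat - a||^2 / 2 (unit vectors)."""
    lhs = float(a_hat @ a) - 1.0
    rhs = -float(np.sum((a_hat - a) ** 2)) / 2.0
    return lhs - rhs


rng = np.random.default_rng(0)
m = rng.standard_normal((6, 6))
s = m @ m.T                                    # symmetric "true" matrix
e = 0.01 * rng.standard_normal((6, 6))
s_hat = s + (e + e.T) / 2.0                    # symmetric perturbed matrix

gap, bound = eigen_gap_and_bound(s, s_hat)
print(gap <= bound + 1e-12)

a = rng.standard_normal(6)
a /= np.linalg.norm(a)
a_hat = a + 0.05 * rng.standard_normal(6)
a_hat /= np.linalg.norm(a_hat)
print(abs(unit_vector_identity_gap(a_hat, a)) < 1e-12)
```

The identity holds because $\|\hat{a}_j - a_j\|_2^2 = 2 - 2\hat{a}_j' a_j$ when both vectors have unit length.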
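To prove the second result, we separate the left-hand side of the equation into three parts:

$$\begin{aligned}
\hat{\Sigma}_f(t) - \Sigma_f(t) - \mathbf{A}'\Sigma_0\mathbf{A}
&= \hat{\mathbf{A}}'[\hat{\Sigma}_x(t) - \Sigma_x(t)]\hat{\mathbf{A}} + \hat{\mathbf{A}}'\Sigma_x(t)\hat{\mathbf{A}} - \Sigma_f(t) - \mathbf{A}'\Sigma_0\mathbf{A} \\
&= \hat{\mathbf{A}}'[\hat{\Sigma}_x(t) - \Sigma_x(t)]\hat{\mathbf{A}} + [(\mathbf{A}'\hat{\mathbf{A}})'\Sigma_f(t)\mathbf{A}'\hat{\mathbf{A}} - \Sigma_f(t)] \\
&\quad + [\hat{\mathbf{A}}'\Sigma_0\hat{\mathbf{A}} - \mathbf{A}'\Sigma_0\mathbf{A}].
\end{aligned} \tag{A.6}$$

For the first term on the right-hand side of equation (A.6), since

$$\|\hat{\mathbf{A}}'\|_2 \|\hat{\mathbf{A}}\|_2 = 1, \tag{A.7}$$

we have

$$\|\hat{\mathbf{A}}'[\hat{\Sigma}_x(t) - \Sigma_x(t)]\hat{\mathbf{A}}\|_2
\le \|\hat{\mathbf{A}}'\|_2 \|\hat{\Sigma}_x(t) - \Sigma_x(t)\|_2 \|\hat{\mathbf{A}}\|_2
= O_p(A). \tag{A.8}$$

For the second term, from Condition (A2), we know that $\|\Sigma_f(t)\|_2 = O_p(B)$, and we have

$$\|\hat{\mathbf{A}} - \mathbf{A}\|_2^2 = \|(\hat{\mathbf{A}}' - \mathbf{A}')(\hat{\mathbf{A}} - \mathbf{A})\|_2
\le 2\|\hat{\mathbf{A}}'\mathbf{A} - I_r\|_2 = O_p(AB). \tag{A.9}$$

As a result,

$$\begin{aligned}
\|(\mathbf{A}'\hat{\mathbf{A}})'\Sigma_f(t)\mathbf{A}'\hat{\mathbf{A}} - \Sigma_f(t)\|_2
&= \|\hat{\mathbf{A}}'\mathbf{A}\Sigma_f(t)\mathbf{A}'\hat{\mathbf{A}} - \hat{\mathbf{A}}'\hat{\mathbf{A}}\Sigma_f(t)\hat{\mathbf{A}}'\hat{\mathbf{A}}\|_2 \\
&\le \|\hat{\mathbf{A}}'\|_2 \|\mathbf{A}\Sigma_f(t)\mathbf{A}' - \hat{\mathbf{A}}\Sigma_f(t)\hat{\mathbf{A}}'\|_2 \|\hat{\mathbf{A}}\|_2 \\
&\le \|(\mathbf{A} - \hat{\mathbf{A}})\Sigma_f(t)\mathbf{A}' + \hat{\mathbf{A}}\Sigma_f(t)(\mathbf{A} - \hat{\mathbf{A}})'\|_2 \\
&\le \|\hat{\mathbf{A}} - \mathbf{A}\|_2 \|\Sigma_f(t)\|_2 (\|\mathbf{A}\|_2 + \|\hat{\mathbf{A}}\|_2) \\
&= O_p(A^{1/2}B^{1/2}).
\end{aligned} \tag{A.10}$$

For the third term, again from Condition (A2), we know that $\|\Sigma_0\|_2$ is bounded; therefore,

$$\begin{aligned}
\|\hat{\mathbf{A}}'\Sigma_0\hat{\mathbf{A}} - \mathbf{A}'\Sigma_0\mathbf{A}\|_2
&= \|(\hat{\mathbf{A}} - \mathbf{A})'\Sigma_0\hat{\mathbf{A}} + \mathbf{A}'\Sigma_0(\hat{\mathbf{A}} - \mathbf{A})\|_2 \\
&\le \|(\hat{\mathbf{A}} - \mathbf{A})'\|_2 \|\Sigma_0\|_2 \|\hat{\mathbf{A}}\|_2 + \|\mathbf{A}'\|_2 \|\Sigma_0\|_2 \|\hat{\mathbf{A}} - \mathbf{A}\|_2 \\
&= \|\hat{\mathbf{A}} - \mathbf{A}\|_2 \|\Sigma_0\|_2 (\|\hat{\mathbf{A}}\|_2 + \|\mathbf{A}\|_2) \\
&= O_p(A^{1/2}B^{1/2}).
\end{aligned} \tag{A.11}$$

From equations (A.8), (A.10) and (A.11), we conclude that

$$\hat{\Sigma}_f(t) - \Sigma_f(t) - \mathbf{A}'\Sigma_0\mathbf{A} = O_p(A^{1/2}B^{1/2}). \tag{A.12}$$

□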

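We need the following three lemmas to prove Theorem 2.


Lemma 1. $\partial\mathcal{L}(\theta)/\partial\theta_i$ and $\partial^2\mathcal{L}(\theta)/\partial\theta_i\partial\theta_j$ are uniformly bounded for $\theta \in \Theta$.

Proof. $S_f(t) \ge CC' > 0$, so that $|S_f(t)| > 0$. Given $S_f(0), S_f(-1), \ldots, S_f(-p+1)$, $S_f(t)$
is a polynomial in $A_{mn,j}$, $B_{mn,i}$ and $C_{mn}$, for $1 \le m \le r$ and $1 \le n \le r$, i.e.
$S_f(t) \in C^\infty$. As a result, $S_f(t)^{-1} \in C^\infty$. Since $\ln(\cdot)$, $\Gamma(\cdot)$ and
$\mathrm{tr}(\cdot)$ are all in $C^\infty$, it follows that $\mathcal{L}(\theta) \in C^\infty$, i.e.
$\partial\mathcal{L}(\theta)/\partial\theta_i, \partial^2\mathcal{L}(\theta)/\partial\theta_i\partial\theta_j \in C^\infty$.
Since $\theta \in \Theta$, which is compact, we conclude that $\partial\mathcal{L}(\theta)/\partial\theta_i$ and
$\partial^2\mathcal{L}(\theta)/\partial\theta_i\partial\theta_j$ are bounded on $\Theta$. □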


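Lemma 2. If we denote $K(\theta, \hat{\Sigma}_f(t)) := \hat{\mathcal{L}}(\theta) - \mathcal{L}(\theta) = O_p(A^{1/2}B^{1/2})$, then we have

$$\frac{\partial K(\theta, \hat{\Sigma}_f(t))}{\partial\theta_i} = O_p(A^{1/2}B^{1/2}). \tag{A.13}$$

Proof. By straightforward calculations, it can be shown that $K(\theta, \hat{\Sigma}_f(t))$ can be split into

$$K(\theta, \hat{\Sigma}_f(t)) = C_1(p, n) + F(\theta) + C_2(p, n)G(\theta), \tag{A.14}$$

where $C_1(p, n) = O_p(A)$ and $C_2(p, n)$ are free of $\theta$, while $F(\cdot), G(\cdot) \in C^\infty$
according to Lemma 1. In particular, all the derivatives of $F$ and $G$ are bounded on $\Theta$. As a
consequence,

$$\frac{\partial K(\theta, \hat{\Sigma}_f(t))}{\partial\theta_i}
= \frac{\partial F(\theta)}{\partial\theta_i} + C_2(p, n)\frac{\partial G(\theta)}{\partial\theta_i}
= O_p(A^{1/2}B^{1/2}). \tag{A.15}$$

□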


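Lemma 3. The log-likelihood functions for the CAW model given the observed data $\hat{\Sigma}_f(t)$
and the true data $\Sigma_f(t)$ are

$$\hat{\mathcal{L}}(\theta) = \frac{1}{T}\sum_{t=1}^{T}\Big\{ -\frac{\nu r}{2}\ln 2 - \frac{r(r-1)}{4}\ln\pi
- \sum_{i=1}^{r}\ln\Gamma\Big(\frac{\nu + 1 - i}{2}\Big) - \frac{\nu}{2}\ln\Big|\frac{\hat{S}_f(t)}{\nu}\Big|
+ \Big(\frac{\nu - r - 1}{2}\Big)\ln|\hat{\Sigma}_f(t)| - \frac{1}{2}\mathrm{tr}\big(\nu\hat{S}_f(t)^{-1}\hat{\Sigma}_f(t)\big)\Big\} \tag{A.16}$$

and

$$\mathcal{L}(\theta) = \frac{1}{T}\sum_{t=1}^{T}\Big\{ -\frac{\nu r}{2}\ln 2 - \frac{r(r-1)}{4}\ln\pi
- \sum_{i=1}^{r}\ln\Gamma\Big(\frac{\nu + 1 - i}{2}\Big) - \frac{\nu}{2}\ln\Big|\frac{S_f(t)}{\nu}\Big|
+ \Big(\frac{\nu - r - 1}{2}\Big)\ln|\Sigma_f(t)| - \frac{1}{2}\mathrm{tr}\big(\nu S_f(t)^{-1}\Sigma_f(t)\big)\Big\}, \tag{A.17}$$

respectively. Then we have

$$\hat{\mathcal{L}}(\theta) - \mathcal{L}(\theta) = O_p(A^{1/2}B^{1/2}).$$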
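
To make the structure of the two log-likelihoods in Lemma 3 concrete, the sketch below evaluates one summand and the average over t. It assumes the usual central Wishart log-density with $\nu$ degrees of freedom and scale matrix $S_f(t)/\nu$, and an averaging over the T observations; both are assumptions about the form of (A.16) and (A.17), and the function names are illustrative.

```python
import math
import numpy as np


def caw_loglik_term(sigma, s, nu):
    """Wishart log-density of `sigma` with nu degrees of freedom and scale s / nu."""
    r = sigma.shape[0]
    _, logdet_sigma = np.linalg.slogdet(sigma)
    _, logdet_s = np.linalg.slogdet(s)
    logdet_scale = logdet_s - r * math.log(nu)            # ln|S / nu|
    log_gamma_sum = sum(math.lgamma((nu + 1 - i) / 2.0) for i in range(1, r + 1))
    trace_term = float(np.trace(np.linalg.solve(s, sigma)))  # tr(S^{-1} Sigma)
    return (-nu * r / 2.0 * math.log(2.0)
            - r * (r - 1) / 4.0 * math.log(math.pi)
            - log_gamma_sum
            - nu / 2.0 * logdet_scale
            + (nu - r - 1) / 2.0 * logdet_sigma
            - nu / 2.0 * trace_term)


def caw_loglik(sigmas, ss, nu):
    """Average of the per-period terms over t = 1, ..., T."""
    terms = [caw_loglik_term(sig, s, nu) for sig, s in zip(sigmas, ss)]
    return sum(terms) / len(terms)


# Toy inputs: two periods with 2 x 2 matrices (not data from the paper).
sigmas = [np.array([[1.0, 0.2], [0.2, 0.8]]), np.array([[1.3, 0.1], [0.1, 0.9]])]
ss = [np.array([[1.1, 0.1], [0.1, 0.9]]), np.array([[1.2, 0.0], [0.0, 1.0]])]
print(caw_loglik(sigmas, ss, nu=6.0))
```

For r = 1 the expression reduces to the log-density of a gamma distribution with shape $\nu/2$ and scale $2s/\nu$, which gives a simple way to check the implementation.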


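Proof. By simple algebraic manipulations, we have

\begin{align}
\hat{\mathcal{L}}(\theta) = \mathcal{L}(\theta) + \frac{1}{T}\sum_{t=1}^{T}\Big[
& -\frac{\nu}{2}\ln\frac{|\hat{S}_f(t)|}{|S_f(t)|} + \frac{\nu - r - 1}{2}\ln\frac{|\hat{\Sigma}_f(t)|}{|\Sigma_f(t)|} \tag{A.18} \\
& - \frac{1}{2}\mathrm{tr}\big(\nu(\hat{S}_f(t)^{-1}\hat{\Sigma}_f(t) - S_f(t)^{-1}\Sigma_f(t))\big)\Big]. \tag{A.19}
\end{align}

Since $S_f(t)$ is the conditional mean of $\Sigma_f(t)$, it can be proved that
$\max_t \|S_f(t)\|_2 = O_p(B)$ and $\hat{S}_f(t) - S_f(t) = O_p(A^{1/2}B^{1/2})$, as for $\Sigma_f(t)$ and
$\hat{\Sigma}_f(t) - \Sigma_f(t)$. The basic idea of the proof is the following. We can treat Equation (10)
as a linear map from $(\Sigma_f(t))$ to $(S_f(t))$: $(S_f(t)) = X \cdot (\Sigma_f(t))$, where $X$ is a linear
operator with bounded norm. It follows that, for all $t$, $\|S_f(t)\|_2 \le \alpha \sup_t \|\Sigma_f(t)\|_2$
for some constant $\alpha$, which gives the required bound. A similar argument can be applied to
$\hat{S}_f(t) - S_f(t)$.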


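For the first term of the additional part, since $S_f(t) = O_p(B)$, we have

$$\Big\|\frac{\hat{S}_f(t) - S_f(t)}{S_f(t)}\Big\|_2 = O_p(A^{1/2}B^{1/2}). \tag{A.20}$$

Thus,

$$\ln\frac{|\hat{S}_f(t)|}{|S_f(t)|} = O_p\Big(\Big|\frac{|\hat{S}_f(t)|}{|S_f(t)|} - 1\Big|\Big) = O_p(A^{1/2}B^{1/2}). \tag{A.21}$$

For the second term, it is easy to prove that $\Sigma_f(t) = O_p(B)$, so that we have

$$\Big\|\frac{\hat{\Sigma}_f(t) - \Sigma_f(t)}{\Sigma_f(t)}\Big\|_2 = O_p(A^{1/2}B^{1/2}). \tag{A.22}$$

Thus,

$$\ln\frac{|\hat{\Sigma}_f(t)|}{|\Sigma_f(t)|} = \ln\Big(1 + \frac{|\hat{\Sigma}_f(t)|}{|\Sigma_f(t)|} - 1\Big)
= O_p\Big(\Big|\frac{|\hat{\Sigma}_f(t)|}{|\Sigma_f(t)|} - 1\Big|\Big) = O_p(A^{1/2}B^{1/2}). \tag{A.23}$$
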
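For the third term, first note that as $S_f(t) \ge CC'$, there exists a constant $w > 0$ such that
the minimum eigenvalue of $S_f(t)$ is larger than or equal to $w$. Consequently,
$\|S_f(t)^{-1}\|_2 \le 1/w$, which is bounded. As a result,

$$\begin{aligned}
\|\hat{S}_f(t)^{-1}\hat{\Sigma}_f(t) - S_f(t)^{-1}\Sigma_f(t)\|_2
&\le \|\hat{S}_f(t)^{-1}(\hat{\Sigma}_f(t) - \Sigma_f(t))\|_2 + \|(\hat{S}_f(t)^{-1} - S_f(t)^{-1})\Sigma_f(t)\|_2 \\
&\le \|\hat{S}_f(t)^{-1}\|_2 \|\hat{\Sigma}_f(t) - \Sigma_f(t)\|_2
+ \|\hat{S}_f(t)^{-1}\|_2 \|\hat{S}_f(t) - S_f(t)\|_2 \|S_f(t)^{-1}\|_2 \|\Sigma_f(t)\|_2 \\
&= \frac{1}{w}O_p(A^{1/2}B^{1/2}) + \frac{1}{w^2}O_p(A^{1/2}B^{1/2})O_p(B) \\
&= O_p(A^{1/2}B^{1/2}).
\end{aligned} \tag{A.24}$$

□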
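
The two inequalities in (A.24) rest on the triangle inequality, the submultiplicativity of the spectral norm, and the identity $\hat{S}^{-1} - S^{-1} = \hat{S}^{-1}(S - \hat{S})S^{-1}$. The following sketch checks the resulting bound numerically on randomly generated positive definite matrices. It is an illustration only; the helper names and the toy inputs are assumptions.

```python
import numpy as np


def spec(m):
    """Spectral norm of a matrix."""
    return float(np.linalg.norm(m, 2))


def inverse_product_bound(s_hat, s, sigma_hat, sigma):
    """Return (lhs, rhs) of the bound on ||s_hat^-1 sigma_hat - s^-1 sigma||_2."""
    s_hat_inv = np.linalg.inv(s_hat)
    s_inv = np.linalg.inv(s)
    lhs = spec(s_hat_inv @ sigma_hat - s_inv @ sigma)
    rhs = (spec(s_hat_inv) * spec(sigma_hat - sigma)
           + spec(s_hat_inv) * spec(s_hat - s) * spec(s_inv) * spec(sigma))
    return lhs, rhs


def random_spd(rng, dim):
    """Random symmetric positive definite matrix (toy input)."""
    m = rng.standard_normal((dim, dim))
    return m @ m.T + dim * np.eye(dim)


rng = np.random.default_rng(2)
s = random_spd(rng, 4)
sigma = random_spd(rng, 4)
s_hat = s + 0.05 * np.eye(4)        # perturbed versions of the "true" matrices
sigma_hat = sigma + 0.05 * np.eye(4)

lhs, rhs = inverse_product_bound(s_hat, s, sigma_hat, sigma)
print(lhs <= rhs + 1e-10)
```

The check does not depend on the particular perturbation; any invertible pair of matrices satisfies the same bound.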

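Proof of Theorem 2. From Lemma 3, we have
$\hat{\mathcal{L}}(\theta) - \mathcal{L}(\theta) = K(\theta, \hat{\Sigma}_f(t)) = O_p(A^{1/2}B^{1/2})$. By
taking derivatives of both sides with respect to any parameter in $\theta$, we have

$$\frac{\partial\hat{\mathcal{L}}(\theta)}{\partial\theta_i}
= \frac{\partial\mathcal{L}(\theta)}{\partial\theta_i} + \frac{\partial K(\theta, \hat{\Sigma}_f(t))}{\partial\theta_i}. \tag{A.25}$$

By Lemma 2, we have

$$\begin{aligned}
0 &= \frac{\partial\hat{\mathcal{L}}(\hat{\theta})}{\partial\theta_i} - \frac{\partial\mathcal{L}(\theta)}{\partial\theta_i} \\
&= \frac{\partial\mathcal{L}(\hat{\theta})}{\partial\theta_i} - \frac{\partial\mathcal{L}(\theta)}{\partial\theta_i}
+ \frac{\partial K(\hat{\theta}, \hat{\Sigma}_f(t))}{\partial\theta_i} \\
&= \sum_j \frac{\partial^2\mathcal{L}}{\partial\theta_i\partial\theta_j}(\theta)(\hat{\theta}_j - \theta_j)
+ o_p(\hat{\theta} - \theta) + O_p(A^{1/2}B^{1/2}).
\end{aligned} \tag{A.26}$$

From Lemma 1, we know that $\partial^2\mathcal{L}(\theta)/\partial\theta_i\partial\theta_j$ are bounded for any
$\theta$. In addition, from Conditions (A4) and (A6), there exists some constant $\varepsilon$ such that, in
probability,

$$\Big|\frac{\partial^2\mathcal{L}}{\partial\theta_i\partial\theta_j}(\theta)\Big| > \varepsilon > 0 \tag{A.27}$$

uniformly. As a result, we have

$$\hat{\theta} - \theta = O_p(A^{1/2}B^{1/2}). \tag{A.28}$$

□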
Figure Legends:


Figure 2:

The subfigure with title "11" shows the time series of the variance of the first factor, the
subfigure with title "21" shows that of the covariance between the first and second factors,
and so on.

Figure 3:

The first subplot is for the first entry of the 6-dimensional vector, and so on. For each of the
6 time series, we show the in-sample fit in the upper panel, where the solid line stands for the
real data and the dashed line is the fitted series. The residuals are shown in the middle panel,
while the ACF and PACF of the residuals are displayed in the lower panel.

Figure 4:

Two plots of realized variances and two plots of realized covariances are shown. AIG and BA
are chosen for the realized variances, while the realized covariances are those between AIG
and C, and between AXP and BA.